Material lessons in machine learning


Abstract

Science fiction aficionados are aware of the prospect of a looming technological singularity: a hypothetical point in time at which technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to human civilization. Basically, once we hit “robots designed and improved by robots.” The consequences of the singularity and its potential benefit or harm to the human race have been intensely debated. This is supported by the reality of the near-exponential growth in the application of machine learning (ML) and artificial intelligence (AI) across a multitude of scientific disciplines and the emergence of data science as a subfield in itself (see the launch of our sister journal Patterns as a case-in-point). Materials science is no exception, with Matter publishing many papers on the topic (including works by Rosen and colleagues and Aspuru-Guzik and colleagues in this very issue). Whether this is “the singularity,” there is no debate that ML, AI, and computational-based informatic approaches are heralding a new era in materials science. Here, I am using informatics along with ML and AI as umbrella terms for the practice of systematically extracting knowledge from datasets. These datasets can be established/hosted, or effectively constructed on-the-fly while tackling a particular problem. The recent “database” approach can be traced to the Materials Genome Initiative (https://www.mgi.gov/), which clearly borrowed heavily from the life sciences and such endeavors as the Protein Data Bank (https://www.rcsb.org/). The emerging interest around data mining has made materials scientists increasingly eager to use ML algorithms in their research. For materials science in particular, the power of informatic-based approaches (be it ML, a neural network, virtual screening, and/or other algorithms) lies at the intersection of three key elements that have been undergoing development for decades: (1) predictive material modeling (think DFT and molecular dynamics); (2) the building, hosting, and accessibility of extremely large datasets (emerging from cloud computing and hosting services); and (3) brute-force computational power (beyond Moore’s Law). Indeed, this is a predicted outcome, and materials science is well-equipped to expand the use cases. When I was in grad school (around 2010), I was personally part of a “computational materials” group at MIT, which seems a really simplified label in retrospect.
My advisor and lab head, Markus Buehler, saw the power of computational characterization and design, and the lab had expertise in full atomistic simulation and mechanical characterization. At the time, we had a fairly substantial computational cluster, and I would run, say, tens of simulations, iterating a protein or polymer structure, to determine some behavior (perhaps a stress-strain response, or fracture toughness). Greatly simplifying things, in mechanics-driven studies, we didn’t need to explore the complete “design space”; we only needed a selection of points to fit equations, and thus limited simulations would suffice. Even these were quite computationally expensive for the state-of-the-art at the time (usually due to an explicit waterbox). Making connections between structures and behaviors, one could see that general trends could hypothetically be discovered, but a more universal (and expensive) approach was needed. If we could find the underlying patterns in structure (such as current ML techniques can identify), combined with the ability to search a multi-dimensional design space, we could potentially uncover the requirements for designer material properties “on demand.” It was an exciting time for ideas, and we wrote the concepts up in a text, Biomateriomics (the title being influenced by the “-omics” trends of the time). Some of the groundwork discussed in the text is still being explored by Prof. Buehler with much more advanced techniques. He (among others in the field) had the vision that I lacked as a student, over 10 years in advance. Regardless, it was difficult to predict the advancements of the past decade. I consider myself a computational guy, but the sophistication has grown extensively since I last coded a snippet of MD in LAMMPS.
It is a new environment, for better or worse. On one hand, there is the opportunity to apply ML to new problems, and to combine new computational methods with physical experiments and synthesis (i.e., so-called “robochemists”). On the other hand, it is a bit of a Wild West, lacking standards and benchmarks. Of note, this is an issue that researchers have begun to address post-haste, suggesting best practices and reporting norms. Interestingly, these efforts are mostly driven by a community approach, with ample discussion in relevant conference sessions. There is also a bandwagon effect, with groups tacking ML/AI results onto studies that are either out-of-place, generic, or do not add value to the larger picture. From an editorial perspective, this makes almost every manuscript unique (and a challenge to assess). It definitely keeps us on our toes.
The traditional structure-process-property (SPP) space is a vast, continuously growing, high-dimensional landscape, for which human intuition is ill-suited. That said, materials follow laws, and ML can apply those laws systematically! While each study is an attempt to understand material phenomena (from a fundamental or applied point-of-view), we have seen four obvious trends emerge in this space:
Discovery: Clearly, one of the most attractive uses of ML is its prowess to explore the possible design space of materials. This is easily understood in a combinatorial manner (e.g., what metals can you use to make an alloy?). Note that the combinations can be of elemental units, or of structural arrangement/geometry. The combinations are endless, so various ML approaches are typically implemented to guide the process (trained on known physics). There is a cloud, though: even when successful, cave vaticinium, as the predicted materials can turn out to be unstable or impossible to synthesize. Ideally, we should not predict unobtanium unless it can (eventually) be made in the lab.
Design: Our perspective is that materials are researched/developed with a future application in mind, i.e., with intent (see, for example, the MAP scale and the Progress and Potential descriptions of each paper). This usually comes with an associated list of target specifications/properties/behaviors for the “optimal” material. This leads to perhaps the most common implementation of ML/AI/informatics: seeking the best material(s) for the job. ML can be used (with objective functions to optimize), but also, if the end is known, inverse approaches can be cleverly applied. This is a “reverse engineering” approach to map a property to the components and structure. “Design” can result in “discovery”, but the intention is different. Known sets of material structures (say, perovskites, for example) can simplify the component options and enable an enriched objective function. The tradeoff is between complexity (how many knobs and variables to consider) and power.
Datasets: The next area being exploited is the ongoing development of robust material databases/datasets, shared among researchers for analysis and datamining, i.e., the natural evolution of the Materials Genome Initiative. Superficially, this involves simply calculating/extracting/measuring a pool of material properties and organizing them in a homogeneous (and easily searched) manner. This, in theory, will enable more efficient in-depth analysis, characterization, and experimental verification. Beyond data collection/compilation, however, is finding relationships and links that were previously unknown/unseen or nonintuitive. Think of polymers that act like semiconductive metals, glasses as tough as steel, novel catalytic agents, etc.
A good database is like a digital version of a multidimensional Ashby plot: change your axes (what you are interested in), and a set of high-performing candidate materials will emerge.
Materials as Data: The final area is a little like the other side of the same coin: materials as data/computational platforms. Materials underlie the logic-gates and transistors that compose our computational clusters, and conventional materials are reaching their limits in computation and storage capabilities. To reach the “singularity”, new material platforms are necessary. We’ve seen the beginnings of quantum computing, and developments in neuromorphic architectures (such as those implementing DNA). In parallel, material advances can be used in encryption, where the generated “data” has a one-to-one relationship with material features. Using materials to store, convey, and manipulate information in new ways will lead to advances in informatics, in a reciprocal, self-reinforcing approach to developing and designing matter.
Ongoing integration of “virtual” material discovery, design, and data extraction is even accelerating, alongside the (dare I say) traditional “tangible” methods. At Matter, our aims and scope are defined as “from nano to macro, from fundamentals to application” (https://www.cell.com/matter/map-scale) to assess this path. Perhaps, when this is a fully integrated platform, we can say the materials “singularity” has been reached: the ideal material on demand for any application. Time will tell if we accept manuscripts from robot peers.
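
The discovery, design, and dataset ideas above can be made concrete with a toy example. What follows is only a minimal Python sketch under stated assumptions, not a method from the editorial: the four "elements" A to D, their property values (arbitrary units), and the function names are all hypothetical, and a plain 50/50 average stands in for the trained surrogate model that a real study would use. It enumerates binary combinations (discovery), picks the best candidate for one objective (design), and keeps the non-dominated candidates on two chosen axes (the Ashby-plot view of a dataset).

```python
from itertools import combinations

# Made-up placeholder properties in arbitrary units (NOT measured data)
# for four hypothetical elements.
ELEMENTS = {
    "A": {"stiffness": 70.0, "density": 2.7},
    "B": {"stiffness": 110.0, "density": 4.5},
    "C": {"stiffness": 200.0, "density": 7.9},
    "D": {"stiffness": 20.0, "density": 9.0},
}


def enumerate_binary_candidates(elements):
    """Discovery step: list every 50/50 pair and estimate its properties
    with a simple average (a stand-in for a trained surrogate model)."""
    candidates = []
    for a, b in combinations(sorted(elements), 2):
        props = {
            key: 0.5 * (elements[a][key] + elements[b][key])
            for key in elements[a]
        }
        candidates.append({"name": f"{a}{b}", **props})
    return candidates


def best_by_objective(records, objective):
    """Design step: return the name of the candidate maximizing an objective."""
    return max(records, key=objective)["name"]


def pareto_front(records, maximize, minimize):
    """Dataset step: keep records that no other record beats on both axes."""
    front = []
    for r in records:
        dominated = any(
            o[maximize] >= r[maximize]
            and o[minimize] <= r[minimize]
            and (o[maximize] > r[maximize] or o[minimize] < r[minimize])
            for o in records
        )
        if not dominated:
            front.append(r["name"])
    return sorted(front)


candidates = enumerate_binary_candidates(ELEMENTS)
# Example objective: specific stiffness (stiffness per unit density).
best = best_by_objective(candidates, lambda r: r["stiffness"] / r["density"])
# "Change your axes": here stiffness (higher is better) vs. density (lower is better).
front = pareto_front(candidates, maximize="stiffness", minimize="density")
print(best, front)
```

Swapping the `maximize` and `minimize` keys for other properties is the code analogue of changing the axes of the plot; the brute-force enumeration would of course be replaced by a guided search once the design space grows.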


Similar resources

Machine learning of material behavior

Symbolic machine learning techniques can extract flexible and comprehensible knowledge from empirical data of material behavior. The diversity of symbolic machine learning techniques offers potential to match the requirements of many tasks when models of material behavior need to be created from data. We develop a series of steps for generating material behavior from empirical data and exemplify s...


Interpretable Machine Learning: Lessons from Topic Modeling



Machine learning algorithms in air quality modeling

Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to tackle these issues. It is a fact that input variable type largely affec...


Fairness in Machine Learning: Lessons from Political Philosophy

What does it mean for a machine learning model to be ‘fair’, in terms which can be operationalised? Should fairness consist of ensuring everyone has an equal probability of obtaining some benefit, or should we aim instead to minimise the harms to the least advantaged? Can the relevant ideal be determined by reference to some alternative state of affairs in which a particular social pattern of d...


Machine Learning in Programming by Demonstration: Lessons learned from CIMA

Programming-by-demonstration (PBD) systems learn tasks by watching the user perform them. CIMA is an interactive learning system for modeling the data selected and modified by a user as he or she undertakes a task. Part of a PBD system, CIMA is invoked when user actions are matched to find a common description of their “operands.” Although the system’s interfaces to users and applications are ...



Journal

Journal title: Matter

Year: 2021

ISSN: 2604-7551

DOI: https://doi.org/10.1016/j.matt.2021.04.010